Conversation
|
We now use the snp_effects() function... didnt know that existed, but makes sense |
|
I also add grey lines to the amino acid plot of the diplotype clustering functions - it looks a lot neater. And the haplotype function can now support multiple transcripts. |
|
This is super cool, @sanjaynagi. Thanks very much. I am struggling with bandwidth this week but have asked @leehart and @jonbrenas to take a look so we can merge asap. :) |
|
Awesome, thanks @ahernank |
|
Just adding a corresponding issue for this #819 |
|
Many thanks @sanjaynagi - This is great! Pylint is telling me there's a potential glitch around the following code in the # Get SNP genotype allele counts for the transcript, applying snp_query
df_eff = (
self.snp_effects(
transcript=transcript,
)
.query(snp_query)
.reset_index(drop=True)
)Raises problem:
I gather the issue is that class AnophelesSnpFrequencyAnalysis(AnophelesSnpData, AnophelesFrequencyAnalysis):Obviously all the tests pass, and the examples in the notebooks appear to work, so I don't quite understand why something like an We should probably check to see whether this warning from Pylint is right, anyhow. We might need to modify the inheritance chain, e.g. adding |
|
|
||
| # Get SNP genotype allele counts for the transcript, applying snp_query | ||
| df_eff = ( | ||
| self.snp_effects( |
There was a problem hiding this comment.
Pylint reports:
Instance of 'AnophelesHapClustAnalysis' has no 'snp_effects' member
There was a problem hiding this comment.
Ok. also not clear why the code was working - but i have added the Frequency class to hapclust class now
There was a problem hiding this comment.
Thanks @sanjaynagi . The snp_effects function is in AnophelesSnpFrequencyAnalysis rather than AnophelesFrequencyAnalysis, and AnophelesFrequencyAnalysis only inherits from AnophelesBase, so probably won't make snp_effects available to AnophelesHapClustAnalysis, if I understand correctly.
Also, since AnophelesSnpFrequencyAnalysis inherits from AnophelesSnpData, I suspect we might need to put it before AnophelesSnpData in the AnophelesHapClustAnalysis base classes list. That made we wonder whether we need AnophelesSnpData there at all, but when I tried removing it locally, I got a method resolution error.
Unfortunately, I'm starting to suspect that there might be a larger architectural issue here, around inheritance, namely https://en.wikipedia.org/wiki/Multiple_inheritance#The_diamond_problem
There was a problem hiding this comment.
Oops, my bad. I will replace the import of AnophelesFrequencyAnalysis and see what happens
…alariagen/malariagen-data-python into advanced-haplotype-clustering-05-09-25
|
Also, here is a notebook with a demo of the function https://colab.research.google.com/drive/1HqDJL0cz9gAKaeXdwYjcGXx-X1GYJznL?usp=sharing |
|
Looks like we're still waiting to try using I believe class AnophelesHapClustAnalysis(AnophelesSnpFrequencyAnalysis, AnophelesHapData, AnophelesSnpData):We might be able to address the potential diamond inheritance problem in a separate issue. Otherwise, I reckon this looks good to go. |
|
Hi @sanjaynagi. Is it OK if I bring this PR's branch up-to-date with the base branch, and take a closer look at these CI test failures? |
|
Yes please do!
…On Thu, 8 Jan 2026, 09:34 Lee, ***@***.***> wrote:
*leehart* left a comment (malariagen/malariagen-data-python#817)
<#817 (comment)>
Hi @sanjaynagi <https://github.com/sanjaynagi>. Is it OK if I bring this
PR's branch up-to-date with the base branch, and take a closer look at
these CI test failures?
—
Reply to this email directly, view it on GitHub
<#817 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AIKN6HOLZ4BKSNBYT7SI63D4FYQCXAVCNFSM6AAAAACFYIVBYOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTOMRTGAYDQMRXHE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
Thanks @sanjaynagi . Looks like we have a couple of mypy linting errors in the CI checks https://github.com/malariagen/malariagen-data-python/actions/runs/20814001998/job/59784955756?pr=817 I also noticed that we have: class AnophelesHapClustAnalysis(AnophelesHapData, AnophelesSnpData, AnophelesFrequencyAnalysis)instead of class AnophelesHapClustAnalysis(AnophelesSnpFrequencyAnalysis, AnophelesHapData, AnophelesSnpData):In any case, we are still running into issue #835 at https://github.com/malariagen/malariagen-data-python/actions/runs/20814001969/job/59784955727?pr=817 ...which is preventing us from the CI checks & tests completing. This segfault issue exists outside the changes in this PR. |
|
Noting linting failures:
These can be addressed in similar ways to pending PRs #842 and #844 However, the segfault error is still sporadic and unresolved, via issue #835 and PR #842 |
This PR adds advanced haplotype clustering, allowing users to cut the tree and assign clusters, and to plot haplotype data (amino acid mutations by default) below the tree.
TODO