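Writing the simplest Newick parser for Unity3d (C# or ActionScript)
Keywords: Actionscript Unity3d simplest newick | Updated: 2023-09-27 18:06:34
I'm trying to work out how to read Newick files for a number of animal species, and I haven't been able to find a "logical method/process" for sorting out a Newick string in a simple programming language. I can read C#, AS, JS, GLSL, and HLSL.
I can't find any simple resources; the Wikipedia article doesn't even discuss recursion. Pseudocode for parsing Newick would be great, but I can't find any.
Does anyone know the quickest way to read Newick files in Unity3d? Could you help me set up a sound logical process for working through Newick code, e.g.: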
(A,B,(C,D));
The branch length numbers don't matter for now.
Target project file:
(
(
(
(
(
(
Falco_rusticolus:0.846772,
Falco_jugger:0.846772
):0.507212,
(
Falco_cherrug:0.802297,
Falco_subniger:0.802297
):0.551687
):0.407358,
Falco_biarmicus:1.761342
):1.917030,
(
Falco_peregrinus:0.411352,
Falco_pelegrinoides:0.411352
):3.267020
):2.244290,
Falco_mexicanus:5.922662
):1.768128,
Falco_columbarius:7.69079
)
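Implementing a parser can be hard without a background in formal grammars. The simplest approach therefore seems to be a parser generator such as ANTLR, so that you only need to become familiar with grammar notation. From the grammar you can then generate a parser written in C#.
Fortunately, you can find a Newick grammar online: here.
Update:
Alternatively, a hand-written recursive-descent parser looks something like this: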
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// An inner node of the tree; Length is the length of the branch leading to it.
public class Branch
{
    public double Length { get; set; }
    public List<Branch> SubBranches { get; set; } = new List<Branch>();
}

// A named tip of the tree (a species in the target file).
public class Leaf : Branch
{
    public string Name { get; set; }
}

public class Parser
{
    // Index of the next unread character in input.
    private int currentPosition;
    private readonly string input;

    public Parser(string text)
    {
        // Strip all whitespace up front so the parser never has to skip it.
        input = new string(text.Where(c => !char.IsWhiteSpace(c)).ToArray());
        currentPosition = 0;
    }

    // Tree -> "(" BranchSet ")"
    public Branch ParseTree()
    {
        currentPosition++; // '('
        var branches = ParseBranchSet();
        currentPosition++; // ')'
        return new Branch { SubBranches = branches };
    }

    // BranchSet -> Branch ("," Branch)*
    private List<Branch> ParseBranchSet()
    {
        var ret = new List<Branch>();
        ret.Add(ParseBranch());
        while (PeekCharacter() == ',')
        {
            currentPosition++; // ','
            ret.Add(ParseBranch());
        }
        return ret;
    }

    // Branch -> Subtree ":" NUM
    private Branch ParseBranch()
    {
        var tree = ParseSubTree();
        currentPosition++; // ':'
        tree.Length = ParseDouble();
        return tree;
    }

    // Subtree -> IDENTIFIER | "(" BranchSet ")"
    private Branch ParseSubTree()
    {
        if (char.IsLetter(PeekCharacter()))
        {
            return new Leaf { Name = ParseIdentifier() };
        }
        currentPosition++; // '('
        var branches = ParseBranchSet();
        currentPosition++; // ')'
        return new Branch { SubBranches = branches };
    }

    // Reads a run of letters and underscores.
    private string ParseIdentifier()
    {
        var identifier = "";
        char c;
        while ((c = PeekCharacter()) != 0 && (char.IsLetter(c) || c == '_'))
        {
            identifier += c;
            currentPosition++;
        }
        return identifier;
    }

    // Reads a run of digits and dots and converts it culture-independently.
    private double ParseDouble()
    {
        var num = "";
        char c;
        while ((c = PeekCharacter()) != 0 && (char.IsDigit(c) || c == '.'))
        {
            num += c;
            currentPosition++;
        }
        return double.Parse(num, CultureInfo.InvariantCulture);
    }

    // Returns the next unread character without consuming it, or '\0' at the end.
    private char PeekCharacter()
    {
        if (currentPosition >= input.Length)
        {
            return (char)0;
        }
        return input[currentPosition];
    }
}
It can be used like this:
var tree = new Parser("((A:1, B:2):3, C:4)").ParseTree();
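Once you have the tree, you will probably want to walk it, for example to place one object per species in Unity. Here is a minimal sketch of a depth-first walk that collects every leaf together with its summed branch length from the root. TreeWalker and CollectLeaves are names I made up for illustration; Branch and Leaf are repeated from the parser only so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Repeated from the parser above so that this sketch compiles on its own.
public class Branch
{
    public double Length { get; set; }
    public List<Branch> SubBranches { get; set; } = new List<Branch>();
}

public class Leaf : Branch
{
    public string Name { get; set; }
}

// Hypothetical helper, not part of the parser itself.
public static class TreeWalker
{
    // Depth-first walk: adds (leaf name, distance from the root) pairs to result.
    public static void CollectLeaves(Branch node, double distance,
                                     List<KeyValuePair<string, double>> result)
    {
        double total = distance + node.Length;
        var leaf = node as Leaf;
        if (leaf != null)
        {
            result.Add(new KeyValuePair<string, double>(leaf.Name, total));
            return;
        }
        foreach (Branch child in node.SubBranches)
        {
            CollectLeaves(child, total, result);
        }
    }
}
```

Call it as TreeWalker.CollectLeaves(tree, 0, leaves) with an empty list; the root returned by ParseTree has Length 0, so it adds nothing to the distances.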
By the way, the parser above implements the following grammar, without any error handling:
Tree -> "(" BranchSet ")"
BranchSet -> Branch ("," Branch)*
Branch -> Subtree ":" NUM
Subtree -> IDENTIFIER | "(" BranchSet ")"
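Note that this grammar requires ":" NUM after every branch, so a string like (A,B,(C,D)); from the question would not fit it. Below is a rough sketch of a more lenient variant, assuming names and branch lengths are both optional and that a name is just a run of letters, digits, and underscores (real Newick also allows quoted labels and comments, which this ignores). Node, LenientNewickParser, Expect, and ReadWhile are hypothetical names of mine, not part of the parser above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical node type for this sketch.
public class Node
{
    public string Name = "";
    public double? Length;                        // null when no ":length" was given
    public List<Node> Children = new List<Node>();
}

public class LenientNewickParser
{
    private readonly string text;
    private int pos;                              // index of the next unread character

    public LenientNewickParser(string source)
    {
        // Strip whitespace first; error positions refer to the stripped text.
        text = new string(source.Where(c => !char.IsWhiteSpace(c)).ToArray());
    }

    // Tree -> Subtree [";"]
    public Node Parse()
    {
        Node root = ParseSubtree();
        if (Peek() == ';') pos++;
        if (pos != text.Length)
            throw new FormatException("Unexpected '" + text[pos] + "' at position " + pos);
        return root;
    }

    // Subtree -> ["(" Subtree ("," Subtree)* ")"] [NAME] [":" NUM]
    private Node ParseSubtree()
    {
        var node = new Node();
        if (Peek() == '(')
        {
            pos++;
            node.Children.Add(ParseSubtree());
            while (Peek() == ',')
            {
                pos++;
                node.Children.Add(ParseSubtree());
            }
            Expect(')');
        }
        node.Name = ReadWhile(c => char.IsLetterOrDigit(c) || c == '_');
        if (Peek() == ':')
        {
            pos++;
            string number = ReadWhile(c => char.IsDigit(c) || ".+-eE".IndexOf(c) >= 0);
            double value;
            if (!double.TryParse(number, NumberStyles.Float, CultureInfo.InvariantCulture, out value))
                throw new FormatException("Bad branch length '" + number + "' before position " + pos);
            node.Length = value;
        }
        return node;
    }

    private char Peek()
    {
        return pos < text.Length ? text[pos] : '\0';
    }

    // Consumes the expected character or reports where parsing went wrong.
    private void Expect(char expected)
    {
        if (Peek() != expected)
            throw new FormatException("Expected '" + expected + "' at position " + pos);
        pos++;
    }

    // Consumes characters while accept returns true and returns them as a string.
    private string ReadWhile(Func<char, bool> accept)
    {
        int start = pos;
        while (pos < text.Length && accept(text[pos])) pos++;
        return text.Substring(start, pos - start);
    }
}
```

With this, new LenientNewickParser("(A,B,(C,D));").Parse() returns a root with three children, while an unbalanced input such as "(A,B" throws a FormatException instead of silently running off the end of the string.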
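Hopefully you are interested in converting Newick into JSON/plain objects; I think I have found a solution.
A quick Google search turned up a JS implementation: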
https://www.npmjs.com/package/biojs-io-newick
https://github.com/daviddao/biojs-io-newick
Porting the JS code to AS3 was not hard for me:
// The function that actually converts Newick
function convertNewickToJSON(source:String):Object
{
    var ancestors:Array = [];
    var tree:Object = {};
    // Split on the delimiters; the capturing group keeps them in the token list
    var tokens:Array = source.split(/\s*(;|\(|\)|,|:)\s*/);
    var subtree:Object;
    for (var i:int = 0; i < tokens.length; i++)
    {
        var token:String = tokens[i];
        switch (token)
        {
            case '(': // new children
                subtree = {};
                tree.children = [subtree];
                ancestors.push(tree);
                tree = subtree;
                break;
            case ',': // another branch
                subtree = {};
                ancestors[ancestors.length - 1].children.push(subtree);
                tree = subtree;
                break;
            case ')': // optional name next
                tree = ancestors.pop();
                break;
            case ':': // optional length next
                break;
            default:
                // The previous token decides whether this is a name or a length
                var x:String = tokens[i - 1];
                if (x == ')' || x == '(' || x == ',')
                {
                    tree.name = token;
                }
                else if (x == ':')
                {
                    tree.branch_length = parseFloat(token);
                }
        }
    }
    return tree;
}
// Util function for turning an object into a string
function objectToStr(obj:Object, paramsSeparator:String = "", isNeedUseSeparatorForChild:Boolean = false):String
{
    var str:String = "";
    if (isSimpleType(obj))
    {
        str = String(obj);
    }
    else
    {
        var childSeparator:String = "";
        if (isNeedUseSeparatorForChild)
        {
            childSeparator = paramsSeparator;
        }
        for (var propName:String in obj)
        {
            if (str == "")
            {
                str += "{ ";
            }
            else
            {
                str += ", ";
            }
            str += propName + ": " + objectToStr(obj[propName], childSeparator) + paramsSeparator;
        }
        str += " }";
    }
    return str;
}
// One more util function
function isSimpleType(obj:Object):Boolean
{
    var isSimple:Boolean = false;
    if (typeof(obj) == "string" || typeof(obj) == "number" || typeof(obj) == "boolean")
    {
        isSimple = true;
    }
    return isSimple;
}
var tempNewickSource:String = "((((((Falco_rusticolus:0.846772,Falco_jugger:0.846772):0.507212,(Falco_cherrug:0.802297,Falco_subniger:0.802297):0.551687):0.407358,Falco_biarmicus:1.761342):1.917030,(Falco_peregrinus:0.411352,Falco_pelegrinoides:0.411352):3.267020):2.244290,Falco_mexicanus:5.922662):1.768128,Falco_columbarius:7.69079)";
var tempNewickJSON:Object = this.convertNewickToJSON(tempNewickSource);
var tempNewickJSONText:String = objectToStr(tempNewickJSON);
trace(tempNewickJSONText);
The code above produces the following trace:
{ name: , children: { 0: { name: , children: { 0: { name: , children: { 0: { name: , children: { 0: { name: , children: { 0: { name: , children: { 0: { name: Falco_rusticolus, branch_length: 0.846772 }, 1: { name: Falco_jugger, branch_length: 0.846772 } }, branch_length: 0.507212 }, 1: { name: , children: { 0: { name: Falco_cherrug, branch_length: 0.802297 }, 1: { name: Falco_subniger, branch_length: 0.802297 } }, branch_length: 0.551687 } }, branch_length: 0.407358 }, 1: { name: Falco_biarmicus, branch_length: 1.761342 } }, branch_length: 1.91703 }, 1: { name: , children: { 0: { name: Falco_peregrinus, branch_length: 0.411352 }, 1: { name: Falco_pelegrinoides, branch_length: 0.411352 } }, branch_length: 3.26702 } }, branch_length: 2.24429 }, 1: { name: Falco_mexicanus, branch_length: 5.922662 } }, branch_length: 1.768128 }, 1: { name: Falco_columbarius, branch_length: 7.69079 } } }
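So this approach gives you a way to work with the Newick format just as you would with JSON.
Judging by the title, you are interested not only in C# but also in an AS3 implementation (I'm not sure you can use this "out of the box" in C#, but perhaps you will be able to port it to C#).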
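For what it's worth, a C# port could look roughly like the sketch below. It follows the same token-and-stack idea as the AS3 code, but swaps the dynamic objects for a small class. NewickNode and NewickConverter are names I made up for illustration, and I am assuming .NET's Regex.Split here, which, like the JS split, includes the text matched by a capturing group in the result. Unlike the AS3 version, it simply skips the empty tokens between adjacent delimiters, so unnamed nodes keep the default empty name.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical node type standing in for the dynamic AS3 objects.
public class NewickNode
{
    public string Name = "";
    public double? BranchLength;                  // null when no length was given
    public List<NewickNode> Children = new List<NewickNode>();
}

public static class NewickConverter
{
    public static NewickNode Convert(string source)
    {
        var ancestors = new Stack<NewickNode>();
        var tree = new NewickNode();
        // The capturing group makes Regex.Split return the delimiters as tokens too.
        string[] tokens = Regex.Split(source, @"\s*(;|\(|\)|,|:)\s*");
        for (int i = 0; i < tokens.Length; i++)
        {
            string token = tokens[i];
            switch (token)
            {
                case "(":                         // first child of the current node
                    var child = new NewickNode();
                    tree.Children.Add(child);
                    ancestors.Push(tree);
                    tree = child;
                    break;
                case ",":                         // next sibling under the same parent
                    var sibling = new NewickNode();
                    ancestors.Peek().Children.Add(sibling);
                    tree = sibling;
                    break;
                case ")":                         // back up to the parent
                    tree = ancestors.Pop();
                    break;
                case ":":
                case ";":
                case "":                          // empty text between two delimiters
                    break;
                default:
                    // The previous token decides whether this is a name or a length.
                    string previous = i > 0 ? tokens[i - 1] : "";
                    if (previous == ":")
                    {
                        tree.BranchLength = double.Parse(token, CultureInfo.InvariantCulture);
                    }
                    else if (previous == "(" || previous == "," || previous == ")")
                    {
                        tree.Name = token;
                    }
                    break;
            }
        }
        return tree;
    }
}
```

NewickConverter.Convert("((A:1,B:2):3,C:4);") then gives a root whose first child has length 3 and two leaves A and B. There is no real error handling: an unbalanced ")" surfaces as an InvalidOperationException from the empty stack.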